Hands-on Exercise 3a - Programming Interactive Data Visualisation with R

Published: January 22, 2024

Modified: January 26, 2024

3.1 Learning Outcome

In this exercise, we will learn how to create interactive data visualisations using functions provided by the ggiraph and plotly packages.

3.2 Getting Started

First, write a code chunk to check, install and launch the following R packages:

  • ggiraph for making ggplot2 graphics interactive.

  • plotly, an R library for plotting interactive statistical graphs.

  • DT, which provides an R interface to the JavaScript library DataTables for creating interactive tables on an HTML page.

  • tidyverse, a family of modern R packages designed to support data science, analysis and communication tasks, including creating static statistical graphs.

  • patchwork for combining multiple ggplot2 graphs into one figure.

pacman::p_load(ggiraph, plotly, listviewer,
               patchwork, DT, tidyverse) 
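
For readers who do not have pacman installed, the "check, install and launch" step can be sketched in base R. The helper name missing_packages() is hypothetical and only illustrates the idea; it is not part of pacman, and the install and attach lines are left commented out so that nothing is installed by accident.

```r
# Hypothetical helper (not part of pacman): report which packages are not installed.
# Note that requireNamespace() loads a package's namespace when it is available.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

pkgs <- c("ggiraph", "plotly", "patchwork", "DT", "tidyverse")
to_install <- missing_packages(pkgs)

# Uncomment to install whatever is missing and then attach every package:
# if (length(to_install) > 0) install.packages(to_install)
# invisible(lapply(pkgs, library, character.only = TRUE))
```

pacman::p_load() wraps these steps into a single call, which is why it is used in this exercise.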

3.3 Importing Data

We will use read_csv() of the readr package to import Exam_data.csv into R.

exam_data <- read_csv("data/Exam_data.csv")

3.4 Interactive Data Visualization - ggiraph methods

ggiraph is an htmlwidget and a ggplot2 extension. It enables ggplot graphics to be interactive.

Interactivity is achieved through three aesthetics that the interactive ggplot geoms understand:

  • tooltip: a column of the data set containing the tooltips to be displayed when the mouse is over elements.

  • onclick: a column of the data set containing a JavaScript function to be executed when elements are clicked.

  • data_id: a column of the data set containing an id to be associated with elements.

When used within a shiny application, elements associated with an id (data_id) can be selected and manipulated on client and server sides.
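
As a minimal sketch of how the three aesthetics fit together, the code below maps all of them in a single geom_point_interactive() layer. The toy data frame and its column names (id, maths, sci, js) are made up for illustration and are not taken from Exam_data.csv.

```r
library(ggplot2)
library(ggiraph)

# Toy data, made up for illustration (not Exam_data.csv)
toy <- data.frame(
  id    = c("S1", "S2", "S3"),
  maths = c(55, 70, 85),
  sci   = c(60, 65, 90)
)

# onclick expects a JavaScript instruction stored as a string
toy$js <- sprintf("alert(\"You clicked %s\")", toy$id)

g <- ggplot(toy, aes(x = maths, y = sci)) +
  geom_point_interactive(
    aes(tooltip = id,    # text shown on hover
        data_id = id,    # id used for hover highlighting
        onclick = js),   # JavaScript run on click
    size = 4)

widget <- girafe(ggobj = g, width_svg = 6, height_svg = 6 * 0.618)
widget
```

The sections that follow introduce each aesthetic one at a time.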

3.4.1 Tooltip effect with tooltip aesthetic

We can use the code below to plot an interactive statistical graph with the ggiraph package.

The code chunk consists of two parts:-

  1. First, a ggplot object will be created.

  2. Next, girafe() of ggiraph will be used to create an interactive svg object.

p <- ggplot(data=exam_data, 
       aes(x = ENGLISH)) +
  geom_dotplot_interactive(
    aes(tooltip = ID),
    stackgroups = TRUE, 
    binwidth = 1, 
    method = "histodot") +
  scale_y_continuous(NULL, 
                     breaks = NULL)
girafe(
  ggobj = p,
  width_svg = 6,
  height_svg = 6*0.618
)
Note

First, an interactive version of ggplot2 geom (i.e. geom_dotplot_interactive()) will be used to create the basic graph. Then, girafe() will be used to generate an svg object to be displayed on an html page.

3.5 Interactivity

When we hover the mouse pointer over a data point, the student’s ID will be displayed.

3.5.1 Displaying multiple information on tooltip

The content of the tooltip can be customized by building a character vector with the text to display, as shown in the code below.

exam_data$tooltip <- c(paste0(     
  "Name = ", exam_data$ID,         
  "\n Class = ", exam_data$CLASS,
  "\n Race = ", exam_data$RACE)) 

p1 <- ggplot(data=exam_data, 
       aes(x = ENGLISH)) +
  geom_dotplot_interactive(
    aes(tooltip = tooltip), 
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(
  ggobj = p1,
  width_svg = 8,
  height_svg = 8*0.618
)

The first four lines of the code create a new field called tooltip and populate it with text built from the ID, CLASS and RACE fields. This newly created field is then used as the tooltip aesthetic.
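
To see what the tooltip field actually contains, the sketch below applies the same paste0() pattern to a two-row data frame. The values are made up for illustration and only the column names follow the exercise.

```r
# Two made-up rows with the same column names as the exercise data
toy <- data.frame(
  ID    = c("Student001", "Student002"),
  CLASS = c("3A", "3B"),
  RACE  = c("Chinese", "Malay"),
  stringsAsFactors = FALSE
)

# paste0() is vectorised, so one call builds one tooltip string per row
toy$tooltip <- paste0(
  "Name = ", toy$ID,
  "\n Class = ", toy$CLASS,
  "\n Race = ", toy$RACE)

cat(toy$tooltip[1])
# Name = Student001
#  Class = 3A
#  Race = Chinese
```

Each "\n" starts a new line in the rendered tooltip.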

3.6 Interactivity

By hovering the mouse pointer on a data point, the student’s ID, class and race will be displayed.

3.6.1 Customizing Tooltip style

The code below uses opts_tooltip() of ggiraph to customize tooltip rendering by adding css declarations.

tooltip_css <- "background-color:blue; font-weight:bold; color:white;"

p3 <- ggplot(data=exam_data, 
       aes(x = SCIENCE)) +
  geom_dotplot_interactive(              
    aes(tooltip = tooltip),                   
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p3,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(
    opts_tooltip(
      css = tooltip_css))
)                                        

The background colour of the tooltip is blue and the font is white and bold.
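
When the style grows beyond a couple of declarations, it can be easier to assemble the CSS string from a named vector. The helper css_string() below is hypothetical (it is not a ggiraph function) and is only a sketch of the idea. Note that in CSS, boldness is set with font-weight rather than font-style.

```r
# Hypothetical helper (not part of ggiraph): build a CSS declaration string
# from a named character vector of property = value pairs.
css_string <- function(declarations) {
  paste0(names(declarations), ":", declarations, ";", collapse = " ")
}

tooltip_css <- css_string(c(
  "background-color" = "blue",
  "font-weight"      = "bold",
  "color"            = "white"
))

tooltip_css
# "background-color:blue; font-weight:bold; color:white;"
```

The resulting string can be passed to opts_tooltip(css = tooltip_css) in the same way as a hand-written one.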

3.6.2 Displaying statistics on tooltip

The code below shows an advanced way to customize the tooltip. A function formats the mean and its standard error, which stat_summary() computes with mean_se(). The derived statistics are then displayed in the tooltip.

tooltip <- function(y, ymax, accuracy = .01) {
  mean <- scales::number(y, accuracy = accuracy)
  sem <- scales::number(ymax - y, accuracy = accuracy)
  paste("Mean English scores:", mean, "+/-", sem)
}

gg_point <- ggplot(data=exam_data, 
                   aes(x = RACE)) +
  stat_summary(aes(y = ENGLISH, 
                   tooltip = after_stat(  
                     tooltip(y, ymax))),  
    fun.data = "mean_se", 
    geom = GeomInteractiveCol,  
    fill = "light green"
  ) +
  stat_summary(aes(y = ENGLISH),
    fun.data = mean_se,
    geom = "errorbar", width = 0.2, linewidth = 0.2
  )

girafe(ggobj = gg_point,
       width_svg = 8,
       height_svg = 8*0.618)
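
It may help to see what mean_se() returns on its own. The sketch below uses four made-up scores and formats them the same way as the tooltip() function. By default mean_se() gives the mean plus or minus one standard error. A wider band, such as a normal-approximation 90% interval, would need its mult argument, which is shown here as an assumption about what a reader might want rather than as part of the original code.

```r
library(ggplot2)

scores <- c(60, 70, 80, 90)     # made-up scores for illustration

# mean_se() returns a one-row data frame with columns y, ymin and ymax
se_summary <- mean_se(scores)
half_width <- se_summary$ymax - se_summary$y   # one standard error of the mean

# Same formatting as the tooltip() function in the text
label <- paste("Mean English scores:",
               scales::number(se_summary$y, accuracy = 0.01), "+/-",
               scales::number(half_width, accuracy = 0.01))
label
# "Mean English scores: 75.00 +/- 6.45"

# Normal-approximation 90% interval: scale the standard error by qnorm(0.95)
ci90 <- mean_se(scores, mult = qnorm(0.95))
```

Inside stat_summary(), the multiplier could be passed on with fun.args = list(mult = qnorm(0.95)).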

3.6.3 Hover effect with data_id aesthetic

The code below shows another interactive feature of ggiraph, data_id.

p4 <- ggplot(data=exam_data, 
       aes(x = SCIENCE)) +
  geom_dotplot_interactive(           
    aes(data_id = CLASS),             
    stackgroups = TRUE,               
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p4,                             
  width_svg = 6,                         
  height_svg = 6*0.618                      
)                                        

Interactivity: Elements associated with a data_id (CLASS) will be highlighted upon mouse over.

The default value of the hover css is hover_css = "fill:orange;".

3.6.4 Styling hover effect

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = CLASS),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: #800080;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)                                        

Interactivity: Elements associated with a data_id (i.e. CLASS) will be highlighted upon mouse over.

Note

Unlike the previous example, the CSS customization in this example is encoded directly in the call.

3.6.5 Combining tooltip and hover effect

We can use the code below to combine both tooltip and hover effects on an interactive statistical graph.

p5 <- ggplot(data=exam_data, 
       aes(x = SCIENCE)) +
  geom_dotplot_interactive(              
    aes(tooltip = CLASS, 
        data_id = CLASS),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p5,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: #800080;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)                                        

Interactivity: Elements associated with a data_id (i.e. CLASS) will be highlighted upon mouse over. At the same time, the tooltip will show the CLASS.

3.6.6 Click effect with onclick

The onclick argument of ggiraph provides hotlink interactivity on the web.

The code below shows an example of onclick.

exam_data$onclick <- sprintf("window.open(\"%s%s\")",
"https://www.moe.gov.sg/schoolfinder?journey=Primary%20school",
as.character(exam_data$ID))

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(onclick = onclick),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618)                                        

Interactivity: On mouse click, the web document linked to the data object is opened in the web browser.

Warning

Note that click actions must be a string column in the data set containing valid JavaScript instructions.
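
The sketch below shows what such a string column looks like when it is built with sprintf(). The base URL is a placeholder chosen for illustration and is not the address used in the exercise.

```r
base_url <- "https://example.com/search?q="   # placeholder URL (assumption)
ids <- c("Student001", "Student002")          # made-up IDs

# sprintf() is vectorised: one JavaScript instruction per ID
onclick <- sprintf("window.open(\"%s%s\")", base_url, ids)

cat(onclick[1])
# window.open("https://example.com/search?q=Student001")
```

Each element is a complete JavaScript call, which is what the onclick aesthetic expects.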

3.6.7 Coordinated Multiple Views with ggiraph

We can also implement coordinated multiple views for data visualization.

When a data point in one of the dotplots is selected, the data point with the same ID in the second visualization is highlighted as well.

To build coordinated multiple views of this kind, the following programming strategy will be used:

  1. Appropriate interactive functions of ggiraph will be used to create the multiple views.

  2. The + operator of the patchwork package will be used inside the girafe() function to combine them into interactive coordinated multiple views.

p_1 <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +  
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

p_2 <- ggplot(data=exam_data, 
       aes(x = SCIENCE)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") + 
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

girafe(code = print(p_1 + p_2), 
       width_svg = 6,
       height_svg = 3,
       options = list(
         opts_hover(css = "fill: #800080;"),
         opts_hover_inv(css = "opacity:0.2;")
         )
       ) 

The data_id aesthetic is critical for linking observations between plots; the tooltip aesthetic is optional but nice to have when we mouse over a point.
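
A minimal sketch of adding the optional tooltip to both views is shown below. It uses a small made-up data frame instead of exam_data, and a wider binwidth so that the few dots are visible; both are assumptions made for illustration.

```r
library(ggplot2)
library(ggiraph)
library(patchwork)

# Made-up scores for six students (not Exam_data.csv)
toy <- data.frame(
  ID      = paste0("S", 1:6),
  MATHS   = c(40, 55, 62, 70, 81, 93),
  SCIENCE = c(45, 50, 70, 66, 85, 90)
)

view_maths <- ggplot(toy, aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(data_id = ID, tooltip = ID),   # same id and tooltip in both views
    stackgroups = TRUE, binwidth = 3, method = "histodot") +
  coord_cartesian(xlim = c(0, 100)) +
  scale_y_continuous(NULL, breaks = NULL)

view_science <- ggplot(toy, aes(x = SCIENCE)) +
  geom_dotplot_interactive(
    aes(data_id = ID, tooltip = ID),
    stackgroups = TRUE, binwidth = 3, method = "histodot") +
  coord_cartesian(xlim = c(0, 100)) +
  scale_y_continuous(NULL, breaks = NULL)

# patchwork's + places the two views side by side inside one girafe widget
widget <- girafe(code = print(view_maths + view_science),
                 width_svg = 6,
                 height_svg = 3,
                 options = list(
                   opts_hover(css = "fill: #800080;"),
                   opts_hover_inv(css = "opacity:0.2;")))
widget
```

Hovering over a dot in either view highlights the dot with the same ID in the other and shows that ID in the tooltip.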

3.7 Interactive Data Visualization - plotly method

Plotly’s R graphing library creates interactive web graphics from ggplot2 graphs and/or a custom interface to the (MIT-licensed) JavaScript library plotly.js inspired by the grammar of graphics. Unlike other plotly platforms, plotly.R is free and open source.

There are two ways to create interactive graphs by using plotly:

  • by using plot_ly(), and

  • by using ggplotly()

3.7.1 Creating an interactive scatter plot: plot_ly() method

The code below creates a basic interactive scatter plot.

plot_ly(data = exam_data, 
             x = ~MATHS, 
             y = ~ENGLISH)

3.7.2 Working with visual variable: plot_ly() method

In the code below, the color argument is mapped to a qualitative visual variable (e.g. RACE).

plot_ly(data = exam_data, 
        x = ~ENGLISH, 
        y = ~MATHS, 
        color = ~RACE)
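
When type and mode are omitted, plot_ly() has to infer the trace type from the data. A slightly more explicit sketch is given below, with a made-up data frame standing in for exam_data. The choice of the ColorBrewer palette "Set2" is an assumption made for illustration.

```r
library(plotly)

# Made-up data with the same column names as the exercise
toy <- data.frame(
  ENGLISH = c(50, 62, 71, 80, 45, 90),
  MATHS   = c(55, 60, 75, 85, 40, 95),
  RACE    = c("Chinese", "Malay", "Indian", "Chinese", "Others", "Malay")
)

fig <- plot_ly(data = toy,
               x = ~ENGLISH,
               y = ~MATHS,
               color = ~RACE,       # one colour per level of RACE
               colors = "Set2",     # ColorBrewer palette name (assumption)
               type = "scatter",    # state the trace type explicitly
               mode = "markers")    # draw points only
fig
```

Stating type and mode makes the intent of the code clear to the reader.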

3.7.3 Creating an interactive scatter plot: ggplotly() method

The code below plots an interactive scatter plot using ggplotly().

p <- ggplot(data=exam_data, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
ggplotly(p)
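
By default ggplotly() builds its tooltip from the mapped aesthetics. A common pattern for controlling the text is to map a text aesthetic and ask ggplotly() to show only that. This is sketched below on made-up data; the tooltip wording is an assumption.

```r
library(ggplot2)
library(plotly)

# Made-up data with the same column names as the exercise
toy <- data.frame(
  ID      = c("S1", "S2", "S3", "S4"),
  MATHS   = c(55, 70, 85, 40),
  ENGLISH = c(60, 65, 90, 50)
)

p_text <- ggplot(data = toy,
                 aes(x = MATHS,
                     y = ENGLISH,
                     text = paste("Student:", ID))) +   # used only by ggplotly()
  geom_point(size = 1) +
  coord_cartesian(xlim = c(0, 100),
                  ylim = c(0, 100))

fig <- ggplotly(p_text, tooltip = "text")   # show only the text aesthetic on hover
fig
```

ggplot2 itself ignores the text aesthetic, so the static plot is unchanged.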

3.7.4 Coordinated multiple views with plotly

Creating a coordinated linked plot with plotly involves three steps:

  • highlight_key() of the plotly package is used to create the shared data.

  • Two scatterplots are created by using ggplot2 functions.

  • Lastly, subplot() of the plotly package is used to place them side by side.

Click on a data point in one of the scatterplots and see how the corresponding point in the other scatterplot is selected.

d <- highlight_key(exam_data)
p1 <- ggplot(data=d, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

p2 <- ggplot(data=d, 
            aes(x = MATHS,
                y = SCIENCE)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
subplot(ggplotly(p1),
        ggplotly(p2))

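Thing to learn from the code: highlight_key() wraps exam_data in an object of class crosstalk::SharedData, and because both scatterplots are built from this shared object, a selection made in one is passed to the other.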
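
The object returned by highlight_key() can be inspected directly. The sketch below uses a made-up three-row data frame, and the use of ID as the key is an assumption made for illustration.

```r
library(plotly)

# Made-up data with the same column names as the exercise
toy <- data.frame(
  ID      = c("S1", "S2", "S3"),
  MATHS   = c(55, 70, 85),
  ENGLISH = c(60, 65, 90)
)

# Use ID as the key that identifies each row across linked views
d <- highlight_key(toy, ~ID)

inherits(d, "SharedData")   # d is a crosstalk SharedData object
head(d$origData())          # the data frame wrapped by the shared object
```

Any plot or widget built from d belongs to the same selection group.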

3.8 Interactive Data Visualization - crosstalk methods

Crosstalk is an add-on to the htmlwidgets package. It extends htmlwidgets with a set of classes, functions, and conventions for implementing cross-widget interactions (currently, linked brushing and filtering).

3.8.1 Interactive Data Table: DT package

  • A wrapper of the JavaScript Library DataTables

  • Data objects in R can be rendered as HTML tables using the JavaScript library ‘DataTables’ (typically via R Markdown or Shiny).

DT::datatable(exam_data, class= "compact")
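
datatable() accepts further arguments that control the table. A small sketch on made-up data is given below; the particular settings (five rows per page, column filters on top, no row names) are assumptions chosen for illustration.

```r
library(DT)

# Made-up data with the same column names as the exercise
toy <- data.frame(
  ID      = paste0("S", 1:8),
  CLASS   = rep(c("3A", "3B"), each = 4),
  ENGLISH = c(50, 62, 71, 80, 45, 90, 66, 73)
)

tbl <- datatable(toy,
                 class = "compact",                 # tighter row spacing
                 rownames = FALSE,                  # hide the row-number column
                 filter = "top",                    # per-column filter boxes
                 options = list(pageLength = 5))    # rows shown per page
tbl
```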

3.8.2 Linked brushing: crosstalk method

The code below implements coordinated brushing between a scatter plot and a data table.

d <- highlight_key(exam_data) 
p <- ggplot(d, 
            aes(ENGLISH, 
                MATHS)) + 
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

gg <- highlight(ggplotly(p),        
                "plotly_selected")  

crosstalk::bscols(gg,               
                  DT::datatable(d), 
                  widths = 5)        

Things to learn from the code chunk:

  • highlight() is a function of plotly package. It sets a variety of options for brushing (i.e., highlighting) multiple plots. These options are primarily designed for linking multiple plotly graphs, and may not behave as expected when linking plotly to another htmlwidget package via crosstalk. In some cases, other htmlwidgets will respect these options, such as persistent selection in leaflet.

  • bscols() is a helper function of the crosstalk package. It makes it easy to put HTML elements side by side. It can be called directly from the console but is especially designed to work in an R Markdown document. Warning: this will bring in all of Bootstrap!

3.9 Plotting Practice

Below are some sample practice plots based on the lessons learnt in this Hands-on Exercise.

1) Interactive Histogram.

tooltip_css <- "background-color:blue; font-weight:bold; color:white;"

exam_data$tooltip1 <- c(paste0(     
  "Name = ", exam_data$ID,         
  "\n Class = ", exam_data$CLASS,
  "\n Race = ", exam_data$RACE)) 


h <- ggplot(data=exam_data, 
       aes(x = ENGLISH)) +
  geom_histogram_interactive(
    aes(tooltip = tooltip1),
    binwidth = 1) +
  scale_y_continuous(NULL, 
                     breaks = NULL)
girafe(
  ggobj = h,
  width_svg = 6,
  height_svg = 6*0.618,
  options = list(
    opts_tooltip(
      css = tooltip_css))
)                              

2) Combining tooltip and hover effect

p6 <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_bar_interactive(              
    aes(tooltip = CLASS, 
        data_id = CLASS)) +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p6,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: #800080;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)                                        

3) Setting colors and shapes for an interactive plot via the plot_ly() method.

# First, make sure 'RACE' is a factor with the levels in the order you want
exam_data$RACE <- factor(exam_data$RACE, levels = c("Chinese", "Indian", "Malay", "Others"))

# Now, choose one plotly symbol per level of RACE
race_symbols <- c("circle", "square", "diamond", "cross")  # same order as the levels of RACE

# Generate the plot
plot_ly(data = exam_data, 
        x = ~ENGLISH, 
        y = ~MATHS, 
        color = ~RACE,
        symbol = ~RACE,          # map the levels of RACE to different symbols
        symbols = race_symbols,  # the symbols to use, one per level
        type = "scatter",
        mode = "markers"
)

3.10 References

3.10.1 ggiraph

This link provides online version of the reference guide and several useful articles. Use this link to download the pdf version of the reference guide.

3.10.2 plotly for R

Main reference: Kam, T.S. (2023). Programming Interactive Data Visualisation with R.